feat: add code source references and related page tagging; emits Rela… by notowen333 · Pull Request #832 · strands-agents/docs

notowen333 · 2026-05-11T21:10:29Z

Tag-driven Related Pages

Note: this change does not alter any human-facing pages. It only adds related-page and source-reference metadata for agents and headless browsers. As a follow-up, we may add tags at the bottom of pages that group other related pages.

sourceLinks infrastructure (schema, renderer for ## Implementation blocks) is wired but no values are set — pending the upcoming monorepo migration. Re-adoption is purely a frontmatter exercise.

Surfaces

| Surface | Where | Cap | Score floor | Audience |
| --- | --- | --- | --- | --- |
| `## Related pages` block | appended to body in `/<slug>/index.md` and aggregated in `/llms-full.txt` | 10 | none | LLMs / `llms.txt` consumers |
| JSON-LD `relatedLink` | inside the page's `TechArticle` graph node | 10 | none | HTML-only crawlers |

Algorithm

Score is rarity-weighted Jaccard:

score =  Σ (rarity-weight of tags in BOTH pages)
        ─────────────────────────────────────────
         Σ (rarity-weight of tags in EITHER page)

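Each tag's rarity-weight is 1 - (pages_with_tag / total_tagged_pages), so the more pages use a tag, the less it contributes. Taking the 112 tagged pages reported under Coverage as the total, aws (used by 14 pages) weighs 1 - 14/112 = 0.875 and agentcore (used by 4) weighs 1 - 4/112 ≈ 0.964. A tag's weight approaches 0 only as it nears use on every tagged page.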

A 1-tag match on a broad tag ranks below any 2-tag specific match, so coincidental connections sink naturally. A tag that bloats over time loses weight automatically — no manual audit cycle needed to react. Ties break alphabetically by title (deterministic).
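The scoring and ranking described above can be sketched as follows. This is a minimal illustration, not the code in src/util/related-docs.ts: the `TaggedPage` shape and the function names are hypothetical, and dropping pages with no shared tags is an assumption.

```typescript
// Hypothetical page shape; the real types in the repo may differ.
interface TaggedPage {
  slug: string;
  title: string;
  tags: string[];
}

// rarity-weight(tag) = 1 - (pages_with_tag / total_tagged_pages)
function rarityWeights(pages: TaggedPage[]): Map<string, number> {
  const counts = new Map<string, number>();
  pages.forEach((page) => {
    // De-duplicate so a tag repeated on one page counts once.
    new Set(page.tags).forEach((tag) => {
      counts.set(tag, (counts.get(tag) ?? 0) + 1);
    });
  });
  const weights = new Map<string, number>();
  counts.forEach((count, tag) => weights.set(tag, 1 - count / pages.length));
  return weights;
}

// Weighted Jaccard: weight of shared tags over weight of all tags on either page.
function score(a: TaggedPage, b: TaggedPage, weights: Map<string, number>): number {
  const tagsA = new Set(a.tags);
  const tagsB = new Set(b.tags);
  let shared = 0;
  let union = 0;
  Array.from(new Set(a.tags.concat(b.tags))).forEach((tag) => {
    const w = weights.get(tag) ?? 0;
    union += w;
    if (tagsA.has(tag) && tagsB.has(tag)) shared += w;
  });
  return union === 0 ? 0 : shared / union;
}

// Rank every other page against the target; ties break alphabetically by title.
function relatedPages(target: TaggedPage, pages: TaggedPage[], cap = 10) {
  const weights = rarityWeights(pages);
  return pages
    .filter((p) => p.slug !== target.slug)
    .map((page) => ({ page, score: score(target, page, weights) }))
    .filter((r) => r.score > 0) // assumption: zero-overlap pages are never listed
    .sort((x, y) => y.score - x.score || x.page.title.localeCompare(y.page.title))
    .slice(0, cap);
}
```

Because the weights are recomputed from the corpus, a tag that spreads to more pages lowers its own contribution without any change to the scorer.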

Tag registry

src/config/tags.yml — 32 tags grouped by axis (topics / capabilities / lifecycle), validated by Zod at build time. The YAML header documents the rule:

A tag means "this page teaches the named thing." Not "mentions" or "has a section labeled with it."

…with a four-trap checklist drawn from concrete failure modes encountered while tagging the corpus.

Coverage

112 of 123 user-guide pages tagged. 11 are deliberately untagged (umbrella pages, policy docs, pages with no good cross-section bridge).

Concrete output for `safety-security/guardrails`

Source frontmatter:

title: Guardrails
tags: [safety, bedrock, aws]

`/docs/user-guide/safety-security/guardrails/index.md` and the same block inside `/llms-full.txt`

Top 10, ranked by score, no floor:

## Related pages

- [Amazon Bedrock](/docs/user-guide/concepts/model-providers/amazon-bedrock/index.md) (3 shared tags)
- [Amazon Nova](/docs/user-guide/concepts/model-providers/amazon-nova/index.md) (2 shared tags)
- [Nova Sonic](/docs/user-guide/concepts/bidirectional-streaming/models/nova_sonic/index.md) (2 shared tags)
- [Deploying Strands Agents to Amazon Bedrock AgentCore Runtime](/docs/user-guide/deploy/deploy_to_bedrock_agentcore/index.md) (2 shared tags)
- [Python Deployment to Amazon Bedrock AgentCore Runtime](/docs/user-guide/deploy/deploy_to_bedrock_agentcore/python/index.md) (2 shared tags)
- [TypeScript Deployment to Amazon Bedrock AgentCore Runtime](/docs/user-guide/deploy/deploy_to_bedrock_agentcore/typescript/index.md) (2 shared tags)
- [AgentCore Evaluation Dashboard Configuration](/docs/user-guide/evals-sdk/how-to/agentcore_evaluation_dashboard/index.md) (2 shared tags)
- [PII Redaction](/docs/user-guide/safety-security/pii-redaction/index.md) (2 shared tags)
- [Harmfulness Evaluator](/docs/user-guide/evals-sdk/evaluators/harmfulness_evaluator/index.md) (1 shared tag)
- [Refusal Evaluator](/docs/user-guide/evals-sdk/evaluators/refusal_evaluator/index.md) (1 shared tag)

Links are emitted as index.md siblings so an LLM following them stays on the markdown surface.
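As a rough sketch of how such a block could be rendered from a ranked list, assuming a hypothetical `RelatedEntry` shape and a slug that omits the `/docs/` prefix (neither is confirmed by the PR):

```typescript
// Hypothetical entry shape; field names are illustrative only.
interface RelatedEntry {
  title: string;
  slug: string; // e.g. "user-guide/safety-security/pii-redaction"
  sharedTags: number;
}

// Render the markdown block, linking to index.md siblings rather than HTML URLs.
function renderRelatedPages(entries: RelatedEntry[]): string {
  if (entries.length === 0) return "";
  const lines = entries.map((e) => {
    const noun = e.sharedTags === 1 ? "shared tag" : "shared tags";
    return `- [${e.title}](/docs/${e.slug}/index.md) (${e.sharedTags} ${noun})`;
  });
  return ["## Related pages", ""].concat(lines).join("\n");
}
```

Returning an empty string for an empty list is an assumption; it keeps untagged pages free of an empty heading.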

`<head>` JSON-LD `TechArticle.relatedLink`

Same 10 entries, same ordering, but as canonical HTML URLs:

{
  "@type": "TechArticle",
  "headline": "Guardrails",
  "keywords": "safety, bedrock, aws",
  "relatedLink": [
    "https://strandsagents.com/docs/user-guide/concepts/model-providers/amazon-bedrock/",
    "https://strandsagents.com/docs/user-guide/concepts/model-providers/amazon-nova/",
    /* ... */
  ]
}
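A node like the one above could be assembled along these lines. This is a sketch only: `SITE_ORIGIN` is taken from the example URLs, while the function names and the slug format are hypothetical.

```typescript
// Origin as it appears in the example output above.
const SITE_ORIGIN = "https://strandsagents.com";
const HEADLESS_MAX = 10; // cap stated in the Surfaces table

// Canonical HTML URL for a page slug such as "user-guide/deploy/foo".
function toCanonicalUrl(slug: string): string {
  return `${SITE_ORIGIN}/docs/${slug}/`;
}

// Build the TechArticle graph node; relatedSlugs is assumed to be already ranked.
function techArticleNode(title: string, tags: string[], relatedSlugs: string[]) {
  return {
    "@type": "TechArticle",
    headline: title,
    keywords: tags.join(", "),
    relatedLink: relatedSlugs.slice(0, HEADLESS_MAX).map(toCanonicalUrl),
  };
}
```

Passing the result through `JSON.stringify` yields the string that would be embedded in the page's JSON-LD script element.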

HTML "See also" pills

Top 6 from the same ranking, filtered to score ≥ 0.4. On Guardrails this lands as 6 pills; on a thin-tag page (e.g. retry-strategies whose best score is 0.34) the strip is empty, by design.
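The pill selection amounts to a floor plus a cap over the shared ranking. A minimal sketch, in which the `Scored` shape and the function name `seeAlsoPills` are hypothetical and the values 6 and 0.4 come from the description above:

```typescript
// Hypothetical shape for one ranked candidate.
interface Scored {
  title: string;
  score: number;
}

const HUMAN_MAX = 6;
const HUMAN_SCORE_FLOOR = 0.4;

// `ranked` is assumed to be sorted by descending score already, so applying
// the floor before or after the cap gives the same result.
function seeAlsoPills(ranked: Scored[]): Scored[] {
  return ranked.filter((r) => r.score >= HUMAN_SCORE_FLOOR).slice(0, HUMAN_MAX);
}
```

A page whose best candidate scores 0.34 gets an empty array back, which matches the empty strip described for retry-strategies.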

Safety nets

- Zod validation on tag registry: unknown tags fail the build at the page level.
- Title-uniqueness lint in test/content-collection.test.ts: fails the build on cross-section title collisions (the "two pages named Hooks" failure mode that produces ambiguous pills).
- 14 algorithm tests in test/related-docs.test.ts covering scoring, tie-breaking, score floor, edge cases. 311 tests total in the repo.
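The first two checks can be illustrated in plain TypeScript. This sketch deliberately does not use Zod and is not the repo's test code; the `PageMeta` shape and both function names are hypothetical stand-ins for the behaviour described above.

```typescript
// Hypothetical frontmatter subset used by both checks.
interface PageMeta {
  slug: string;
  title: string;
  tags: string[];
}

// Report every tag that is missing from the registry, one message per use.
function findUnknownTags(pages: PageMeta[], registry: string[]): string[] {
  const known = new Set(registry);
  const errors: string[] = [];
  pages.forEach((page) => {
    page.tags.forEach((tag) => {
      if (!known.has(tag)) errors.push(`${page.slug}: unknown tag "${tag}"`);
    });
  });
  return errors;
}

// Group slugs by title and keep only titles used by more than one page.
function findTitleCollisions(pages: PageMeta[]): Map<string, string[]> {
  const byTitle = new Map<string, string[]>();
  pages.forEach((page) => {
    const slugs = byTitle.get(page.title) ?? [];
    slugs.push(page.slug);
    byTitle.set(page.title, slugs);
  });
  const collisions = new Map<string, string[]>();
  byTitle.forEach((slugs, title) => {
    if (slugs.length > 1) collisions.set(title, slugs);
  });
  return collisions;
}
```

A build step or test would fail when either function returns a non-empty result.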

Verification

- npm run build: clean, no broken links across 515 pages
- npm run typecheck: clean
- npm test: 311 pass

Type of Change

New feature

Checklist

- I have read the CONTRIBUTING document
- My changes follow the project's documentation style
- I have tested the documentation locally using npm run dev
- Links in the documentation are valid and working

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…ted/Implementation blocks into /<slug>/index.md and /llms-full.txt, and JSON-LD TechArticle keywords + relatedLink into <head>; No visible change; all headless updates

github-actions · 2026-05-11T21:15:42Z

Documentation Preview Ready

Your documentation preview has been successfully deployed!

Preview URL: https://d3ehv1nix5p99z.cloudfront.net/pr-cms-832/docs/user-guide/quickstart/overview/

Updated at: 2026-05-14T17:32:42.175Z

…er facing page; add title uniqueness test

Replaces the earlier raw-overlap algorithm with a specificity-weighted Jaccard scorer plus a confidence floor for the human-facing surface.

Algorithm (src/util/related-docs.ts):
- Score is rarity-weighted Jaccard: each shared tag's contribution scales with its rarity in the corpus (1 - freq/N), so a shared `bedrock` (8/108) outweighs a shared `aws` (14/108). Self-correcting if a tag bloats over time: its weight automatically drops.
- Headless surface (/<slug>/index.md, /llms-full.txt, JSON-LD relatedLink): top 10, no floor. LLMs benefit from wider recall and self-filter.
- Human surface ("See also" pill row): top 6 with score >= 0.4 floor. Empty strip is strictly better than a misleading one; the floor catches the spurious 1-broad-tag matches that earlier audits kept finding by hand.
- Specificity table memoized on the input array (cheap WeakMap hit).

Vocabulary (src/config/tags.yml):
- 32-tag registry organized by axis (topics / capabilities / lifecycle).
- Inline rule and authoring checklist at the head of the YAML: "tag = teaches, not mentions", with concrete failure-pattern examples drawn from earlier audit rounds.
- Build-time Zod validation (src/config/tags.ts); unknown tags fail.

Content:
- 108 of 123 user-guide pages tagged; 15 deliberately untagged (umbrella/policy pages and known structural cases).
- Two audit passes done; broad-tag misuses (production, aws on session-management, tool-execution on model providers, etc.) stripped.

Surface design (src/components/RelatedPagesInline.astro):
- Outlined pills wrapping to fit content. Small "Related" caption above.
- Mounted in MarkdownContent.astro above Starlight's Prev/Next pagination.
- No surface-aware filtering; the algorithm is unaware of Pagination.

Lint:
- Title-uniqueness check in test/content-collection.test.ts catches cross-section title collisions (the kind that produce ambiguous "Hooks" pills).

311 tests pass; 14 cover the algorithm directly.
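The WeakMap memoization mentioned in this commit message might look roughly like the sketch below. The names `specificityTable` and `tableCache` are hypothetical; only the 1 - freq/N formula and the idea of keying the cache on the input array come from the message.

```typescript
// Hypothetical minimal page shape; only tags matter for the table.
interface TaggedPage {
  tags: string[];
}

// Keyed on the array object itself, so entries vanish when the array is collected.
const tableCache = new WeakMap<TaggedPage[], Map<string, number>>();

function specificityTable(pages: TaggedPage[]): Map<string, number> {
  const cached = tableCache.get(pages);
  if (cached) return cached; // same array instance: reuse the earlier table

  const counts = new Map<string, number>();
  pages.forEach((page) => {
    new Set(page.tags).forEach((tag) => {
      counts.set(tag, (counts.get(tag) ?? 0) + 1);
    });
  });

  const table = new Map<string, number>();
  counts.forEach((freq, tag) => table.set(tag, 1 - freq / pages.length));
  tableCache.set(pages, table);
  return table;
}
```

The cache is keyed by identity, not content: a copy of the array recomputes the table, and mutating the original array in place would leave a stale entry, so this pattern suits a collection that is loaded once per build.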

…t labels

- Drop "right outcome" / "to bring those pages back" prose from HUMAN_SCORE_FLOOR comment; keep the calibration note and a one-line hint to add sharper tags.
- Update describe/it labels in the headless test suite from "Jaccard" / "tag-overlap size" to the current language ("specificity-weighted Jaccard" / "score") so test output reflects the actual algorithm.

No behavior change. 14 tests still pass.

The visible pill row at the bottom of user-guide articles is removed. Tag-driven Related Pages remains in the headless surfaces:
- `## Related pages` block in /<slug>/index.md and /llms-full.txt
- `relatedLink` array in <head> JSON-LD `TechArticle`

What this removes:
- src/components/RelatedPagesInline.astro (deleted)
- humanRelatedUserGuideFor() and its tests
- HUMAN_MAX, HUMAN_SCORE_FLOOR constants and the floor docblock
- import + render from MarkdownContent.astro

Tag authoring rule (src/config/tags.yml header) updated to reflect that the surface is now headless-only.

306 tests pass; build clean; HTML body verified to contain zero traces of the removed strip.

feat: add code source references and related page tagging; emits Rela…

1c4e3c7

…ted/Implementation blocks into /<slug>/index.md and /llms-full.txt, and JSON-LD TechArticle keywords + relatedLink into <head>; No visible change; all headless updates

notowen333 requested a review from zastrowm May 11, 2026 21:10

notowen333 temporarily deployed to auto-approve May 11, 2026 21:10 — with GitHub Actions Inactive

notowen333 requested a deployment to manual-approval May 11, 2026 21:10 — with GitHub Actions Waiting

fix: remove source references in frontmatter; add related pages to us…

1644cc9

…er facing page; add title uniqueness test

notowen333 temporarily deployed to auto-approve May 12, 2026 14:58 — with GitHub Actions Inactive

notowen333 requested a deployment to manual-approval May 12, 2026 14:58 — with GitHub Actions Waiting

zastrowm reviewed May 13, 2026

notowen333 temporarily deployed to auto-approve May 14, 2026 04:16 — with GitHub Actions Inactive

notowen333 requested a deployment to manual-approval May 14, 2026 04:16 — with GitHub Actions Waiting

notowen333 requested a deployment to manual-approval May 14, 2026 04:36 — with GitHub Actions Waiting

notowen333 temporarily deployed to auto-approve May 14, 2026 04:36 — with GitHub Actions Inactive

notowen333 temporarily deployed to auto-approve May 14, 2026 17:27 — with GitHub Actions Inactive

notowen333 requested a deployment to manual-approval May 14, 2026 17:28 — with GitHub Actions Waiting

zastrowm approved these changes May 14, 2026

notowen333 merged commit 76c487b into strands-agents:main May 14, 2026
4 of 5 checks passed

feat: add code source references and related page tagging; emits Rela… #832
notowen333 merged 5 commits into strands-agents:main from notowen333:headless-improvement

2 participants
